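Neural approaches for combinatorial optimization (CO) are equipped with a learning mechanism to discover powerful heuristics for solving complex real-world problems. While neural approaches capable of producing high-quality solutions in a single shot are emerging, state-of-the-art approaches are often unable to take full advantage of the solving time available to them. In contrast, hand-crafted heuristics search efficiently and exploit the computation time given to them, but they contain heuristics that are difficult to adapt to the dataset being solved. To provide neural CO approaches with a powerful search procedure, we propose simulation-guided beam search (SGBS), which examines, within a fixed-width tree search, candidate solutions that both a neural-network-learned policy and a simulation (rollout) identify as promising. We further hybridize SGBS with efficient active search (EAS), where SGBS enhances the quality of the solutions backpropagated in EAS, and EAS improves the quality of the policy used in SGBS. We evaluate our methods on well-known CO benchmarks and show that SGBS significantly improves the quality of the solutions found under reasonable runtime assumptions.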
translated by 谷歌翻译
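The search procedure described in the first abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the problem is a toy travelling-salesman instance, the "policy" is a nearest-neighbour ranking standing in for a learned network, and the names `sgbs`, `policy_ranking`, `greedy_rollout`, `beam_width`, and `expansion` are hypothetical rather than taken from the paper.

```python
import math

def tour_length(pts, tour):
    """Length of the closed tour visiting pts in the given order."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def policy_ranking(pts, partial):
    """Stand-in for a learned policy (assumption): rank unvisited
    cities by distance from the last visited city, nearest first."""
    last = partial[-1]
    rest = [c for c in range(len(pts)) if c not in partial]
    return sorted(rest, key=lambda c: math.dist(pts[last], pts[c]))

def greedy_rollout(pts, partial):
    """Complete a partial tour by always taking the policy's top choice."""
    tour = list(partial)
    while len(tour) < len(pts):
        tour.append(policy_ranking(pts, tour)[0])
    return tour

def sgbs(pts, beam_width=3, expansion=3):
    """Fixed-width tree search: the policy proposes children,
    rollouts decide which children stay in the beam."""
    beam = [[0]]                      # every tour starts at city 0
    best = greedy_rollout(pts, [0])   # single-shot baseline solution
    while len(beam[0]) < len(pts):
        candidates = []
        for partial in beam:
            # Expansion: only the policy's top-ranked children.
            for city in policy_ranking(pts, partial)[:expansion]:
                child = partial + [city]
                rollout = greedy_rollout(pts, child)
                candidates.append((tour_length(pts, rollout), child, rollout))
        # Pruning: keep the children whose rollouts scored best.
        candidates.sort(key=lambda item: item[0])
        if candidates[0][0] < tour_length(pts, best):
            best = candidates[0][2]
        beam = [child for _, child, _ in candidates[:beam_width]]
    return best
```

Because the incumbent is initialised with the plain greedy rollout and replaced only by strictly shorter tours, this sketch never returns a tour longer than the single-shot baseline; how much it improves depends on the instance and on the two width parameters.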
There has been great recent advancement in human-computer chat. However, proper evaluation currently requires human judgements that produce notoriously high-variance metrics due to their inherent subjectivity. Furthermore, there is little standardization in the methods and labels used for evaluation, with an overall lack of work to compare and assess the validity of various evaluation approaches. As a consequence, existing evaluation results likely leave an incomplete picture of the strengths and weaknesses of open-domain chatbots. We aim towards a dimensional evaluation of human-computer chat that can reliably measure several distinct aspects of chat quality. To this end, we present our novel human evaluation method that quantifies the rate of several quality-related chatbot behaviors. Our results demonstrate that our method is more suitable for dimensional chat evaluation than alternative Likert-style or comparative methods. We then use our validated method and existing methods to evaluate four open-domain chat models from the recent literature.
Metaverse over wireless networks is an emerging use case of the sixth generation (6G) wireless systems, posing unprecedented challenges in terms of its multi-modal data transmissions with stringent latency and reliability requirements. Towards enabling this wireless metaverse, in this article we propose a novel semantic communication (SC) framework by decomposing the metaverse into human/machine agent-specific semantic multiverses (SMs). An SM stored at each agent comprises a semantic encoder and a generator, leveraging recent advances in generative artificial intelligence (AI). To improve communication efficiency, the encoder learns the semantic representations (SRs) of multi-modal data, while the generator learns how to manipulate them for locally rendering scenes and interactions in the metaverse. Since these learned SMs are biased towards local environments, their success hinges on synchronizing heterogeneous SMs in the background while communicating SRs in the foreground, turning the wireless metaverse problem into the problem of semantic multiverse communication (SMC). Based on this SMC architecture, we propose several promising algorithmic and analytic tools for modeling and designing SMC, ranging from distributed learning and multi-agent reinforcement learning (MARL) to signaling games and symbolic AI.
Recently, deep learning approaches have been extensively studied for various problems in chemistry, such as property prediction, virtual screening, and de novo molecule design. Despite the impressive successes, separately designed networks for specific tasks are usually required for end-to-end training, so it is often difficult to acquire a unified principle to synergistically combine existing models and training datasets for novel tasks. To address this, here we present a novel multimodal chemical foundation model that can be used for various downstream tasks that require a simultaneous understanding of structure and property. Specifically, inspired by recent advances in pre-trained multi-modal foundation models such as Vision-Language Pretrained models (VLP), we propose a novel structure-property multi-modal (SPMM) foundation model using a dual-stream transformer with X-shape attention, so that it can align the molecule structure and the chemical properties in a common embedding space. Thanks to the outstanding structure-property unimodal representation, experimental results confirm that SPMM can simultaneously perform molecule generation, property prediction, classification, reaction prediction, etc., which was previously not possible with a single architecture.
Single-image 3D human reconstruction aims to reconstruct the 3D textured surface of the human body given a single image. While implicit function-based methods recently achieved reasonable reconstruction performance, they still bear limitations showing degraded quality in both surface geometry and texture from an unobserved view. In response, to generate a realistic textured surface, we propose ReFu, a coarse-to-fine approach that refines the projected backside view image and fuses the refined image to predict the final human body. To suppress the diffused occupancy that causes noise in projection images and reconstructed meshes, we propose to train occupancy probability by simultaneously utilizing 2D and 3D supervisions with occupancy-based volume rendering. We also introduce a refinement architecture that generates detail-preserving backside-view images with front-to-back warping. Extensive experiments demonstrate that our method achieves state-of-the-art performance in 3D human reconstruction from a single image, showing enhanced geometry and texture quality from an unobserved view.
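Incorporating personal preference is crucial in advanced machine translation tasks. Despite recent advances in machine translation, properly reflecting personal style remains a demanding task. In this paper, we introduce a personalized automatic post-editing (APE) framework to address this challenge, which effectively generates sentences that account for distinct personal behaviors. To build this framework, we first collect post-editing data that captures user preferences from a live machine translation system. Specifically, real-world users enter source sentences for translation and edit the machine-translated outputs according to their preferred style. We then propose a model that combines a discriminator module and user-specific parameters on top of the APE framework. Experimental results show that the proposed method outperforms other baseline models on four different metrics (i.e., BLEU, TER, YiSi-1, and human evaluation).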
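Diagnosis based on medical images, such as X-ray images, often involves manual annotation of anatomical keypoints. However, this process requires significant human effort and can therefore become a bottleneck in the diagnostic process. To fully automate this procedure, deep-learning-based methods have been widely proposed and have achieved high performance in detecting keypoints in medical images. However, these methods still have clinical limitations: accuracy cannot be guaranteed for all cases, and doctors must double-check every prediction the models make. In response, we propose a novel deep neural network that, given an X-ray image, automatically detects and refines the anatomical keypoints through a user-interactive system in which doctors can fix mispredicted keypoints with fewer clicks than manual revision requires. Using our own collected data and the publicly available AASCE dataset, we demonstrate the effectiveness of the proposed method in reducing annotation costs through extensive quantitative and qualitative results. A demo video of our approach is available on our project webpage.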
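Image classification models often learn to predict a class based on irrelevant co-occurrences between input features and an output class in the training data. We call these unwanted correlations "data biases," and the visual features causing data biases "bias factors." Identifying and mitigating biases automatically, without human intervention, is challenging. We therefore conducted a design study to find a human-in-the-loop solution. First, working with three experts, we identified user tasks that capture the bias mitigation process for image classification models. Then, to support these tasks, we developed a visual analytics system called DASH that allows users to visually identify bias factors, iteratively generate synthetic images using a state-of-the-art image-to-image translation model, and supervise the model training process to improve classification accuracy. Our quantitative evaluation and qualitative study with ten participants demonstrate the usefulness of DASH and provide lessons for future work.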
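Predicting traffic conditions is highly challenging because every road depends strongly on the others, both spatially and temporally. Recently, to capture this spatial and temporal dependency, specially designed architectures such as graph convolutional networks and temporal convolutional networks have been introduced. Although traffic forecasting has made remarkable progress, we found that deep-learning-based traffic forecasting models still fail on certain patterns, mainly in event situations (e.g., rapid speed drops). Although these failures are commonly attributed to unpredictable noise, we found that they can be corrected by considering previous failures. Specifically, we observe autocorrelated errors in these failures, which indicates that some predictable information remains. In this study, to capture the correlation of errors, we introduce Rescal, a residual estimation module for traffic forecasting, as a widely applicable add-on module for existing traffic forecasting models. Rescal calibrates the predictions of existing models in real time by estimating future errors from previous errors and graph signals. Extensive experiments on METR-LA and PEMS-BAY demonstrate that Rescal can correctly capture the correlation of errors and correct the failures of various traffic forecasting models in event situations.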
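The idea of correcting a forecast with its own autocorrelated errors can be made concrete with a small sketch. This is not the module from the abstract: it swaps in a plain first-order autoregressive fit on past errors, ignores graph signals entirely, and the names `fit_ar1` and `calibrate` are hypothetical.

```python
def fit_ar1(errors):
    """Least-squares estimate of phi in e[t+1] ~ phi * e[t],
    where each error is (observed value - base forecast)."""
    prev, nxt = errors[:-1], errors[1:]
    den = sum(a * a for a in prev)
    if den == 0:
        return 0.0  # no signal in past errors: apply no correction
    return sum(a * b for a, b in zip(prev, nxt)) / den

def calibrate(base_forecast, last_error, phi):
    """Shift the base model's next forecast by the estimated next error."""
    return base_forecast + phi * last_error

# Synthetic example (assumption): errors decay geometrically, as they
# might while a base model lags behind a sudden speed drop.
past_errors = [5.0 * 0.8 ** t for t in range(10)]
phi = fit_ar1(past_errors)
corrected = calibrate(60.0, past_errors[-1], phi)
```

If the errors were truly uncorrelated noise, the fitted `phi` would be close to zero and the correction would vanish, which is why the observation of autocorrelated errors is what makes this kind of calibration worthwhile.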
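Semantically meaningful sentence embeddings are important for many tasks in natural language processing. To obtain such embeddings, recent studies have explored using synthetically generated data from pretrained language models (PLMs) as a training corpus. However, PLMs often generate sentences that differ considerably from those written by humans. We hypothesize that treating all these synthetic examples equally when training deep neural networks can adversely affect the learning of semantically meaningful embeddings. To analyze this, we first train a classifier to identify machine-written sentences, and observe that the linguistic features of sentences identified as machine-written differ significantly from those of human-written sentences. Based on this, we propose a novel approach that first trains the classifier to measure the importance of each sentence. The information distilled from the classifier is then used to train a reliable sentence embedding model. Through extensive evaluation on four real-world datasets, we demonstrate that our model trained on synthetic data generalizes well and outperforms existing baselines. Our implementation is publicly available at https://github.com/ddehun/coling2022_reweighting_sts.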
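One simple way to stop treating all synthetic examples equally is to weight each example's loss by an importance score. The sketch below assumes, purely for illustration, that the score is a classifier's estimated probability that a sentence reads as human-written; the abstract does not specify this, and the function name `weighted_mean_loss` is hypothetical.

```python
def weighted_mean_loss(losses, weights):
    """Average per-example losses, scaling each one by its
    importance weight and normalising by the total weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(l * w for l, w in zip(losses, weights)) / total

# Two synthetic sentences: the first is judged more human-like
# (weight 0.75), so its loss dominates the training signal.
batch_loss = weighted_mean_loss([1.0, 3.0], [0.75, 0.25])
```

With equal weights this reduces to the ordinary mean loss, so the weighting only changes training where the importance scores actually differ between examples.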